
test: re-enable sql_hive-1 for Spark 4.0 and fix two small failures #4047

Merged
andygrove merged 4 commits into apache:main from andygrove:test-hive1-spark4-retry
Apr 24, 2026

Conversation

@andygrove
Member

@andygrove andygrove commented Apr 23, 2026

Which issue does this PR close?

Closes #2946.
Closes #4049.

Rationale for this change

The sql_hive-1 job for Spark 4.0 has been excluded from CI since #2946 was filed (hang/timeout on the Hive suite with JDK 17). A re-run on current main shows the hang no longer reproduces: the job completes in about 63 minutes. It does surface two small issues, which this PR also fixes, so the job can be re-enabled and kept green.

What changes are included in this PR?

  1. .github/workflows/spark_sql_test.yml: remove the exclude matrix entry that skipped sql_hive-1 for spark-4.0.1 / java 17 / auto.
  2. spark/src/main/spark-4.0/org/apache/spark/sql/comet/shims/ShimSparkErrorConverter.scala: the FileNotFound case was producing a SparkFileNotFoundException with error class _LEGACY_ERROR_TEMP_2055, which was removed in Spark 4.0. Delegate to QueryExecutionErrors.fileNotExistError(path, cause) so the error carries FAILED_READ_FILE.FILE_NOT_EXIST, which is what HiveMetadataCacheSuite (and similar checkError assertions) expect.
  3. dev/diffs/4.0.1.diff: drop the hunk that commented out assume(new java.io.File(jarPath).exists) in HiveUDFDynamicLoadSuite. Spark 4.0 no longer ships hive-test-udfs.jar, so restoring the upstream assume cancels the five UDF tests cleanly instead of running them without the required classes and failing with CANNOT_LOAD_FUNCTION_CLASS.
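The shim change in item 2 can be sketched roughly as follows. This is an illustrative Scala sketch, not the actual file contents: the method name `convertFileNotFound` and its surrounding structure are assumptions; only `QueryExecutionErrors.fileNotExistError(path, cause)` and the error classes are taken from the PR description.

```scala
import java.io.FileNotFoundException

import org.apache.spark.sql.errors.QueryExecutionErrors

object ShimSparkErrorConverterSketch {
  // Before (Spark 3.x style): build a SparkFileNotFoundException carrying
  // the legacy error class. That class was removed from Spark 4.0's
  // error-classes registry, so throwing it triggers an internal
  // "Cannot find main error class" error instead of the intended message:
  //
  //   new SparkFileNotFoundException(
  //     errorClass = "_LEGACY_ERROR_TEMP_2055",
  //     messageParameters = Map("message" -> s"File $path does not exist"))
  //
  // After: delegate to QueryExecutionErrors so the thrown exception carries
  // FAILED_READ_FILE.FILE_NOT_EXIST with the `path` parameter that
  // checkError assertions in suites like HiveMetadataCacheSuite expect.
  def convertFileNotFound(path: String): Throwable =
    QueryExecutionErrors.fileNotExistError(
      path,
      new FileNotFoundException(s"File $path does not exist"))
}
```

The key point is that the shim no longer hard-codes an error class at all; it lets Spark's own error factory pick the current one, so the shim stays correct if the class is renamed again upstream.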

How are these changes tested?

The CI run on this PR exercises the full sql_hive-1 job for Spark 4.0. The previously failing suites behave as follows:

  • HiveMetadataCacheSuite (4 failures): the four tests now pass with the corrected error class.
  • HiveUDFDynamicLoadSuite (5 failures): the five tests are now cancelled on Spark 4.0 (jar absent) and still run on Spark 3.4/3.5 (jar present).

The Spark 4.0 `ShimSparkErrorConverter` was converting native FileNotFound
errors into a `SparkFileNotFoundException` with error class
`_LEGACY_ERROR_TEMP_2055`, but that error class was removed in Spark 4.0.
Throwing it triggers an internal error ("Cannot find main error class")
and fails tests such as `HiveMetadataCacheSuite` that assert on
`FAILED_READ_FILE.FILE_NOT_EXIST`.

Delegate to `QueryExecutionErrors.fileNotExistError`, which is the 4.0
replacement for `readCurrentFileNotFoundError` and produces the expected
error class and `path` parameter.

The 4.0.1 Spark diff commented out the `assume(new java.io.File(jarPath).exists)`
guard in HiveUDFDynamicLoadSuite. The jar (hive-test-udfs.jar) is not
shipped with Spark 4.0, so without the guard the five UDF/UDAF/UDTF tests
run without the required class and fail with CANNOT_LOAD_FUNCTION_CLASS.

Drop the hunk so upstream's assume() is preserved. The tests are cancelled
on Spark 4.0 (jar missing) and continue to run on Spark 3.4/3.5 (jar
present), matching upstream behavior.

Closes apache#4049.
@andygrove andygrove changed the title test: re-enable sql_hive-1 for Spark 4.0 to check if #2946 still reproduces test: re-enable sql_hive-1 for Spark 4.0 and fix surfaced failures Apr 23, 2026
@andygrove andygrove changed the title test: re-enable sql_hive-1 for Spark 4.0 and fix surfaced failures test: re-enable sql_hive-1 for Spark 4.0 and fix two small failures Apr 23, 2026
@andygrove andygrove marked this pull request as ready for review April 23, 2026 15:24
@andygrove andygrove marked this pull request as draft April 23, 2026 17:38

The upstream assume(jarPath.exists) runs during class construction
(inside udfTestInfos.foreach, before test() registers a case), so when
hive-test-udfs.jar is absent - as it is on the v4.0.1 release tag - the
TestCanceledException propagates out of <init> and ScalaTest marks the
whole suite as aborted, failing the job. Mix in IgnoreCometSuite so the
five tests are reported as ignored under Comet, and comment out the
constructor-time assume so it no longer aborts the suite.
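The constructor-time failure mode described in this commit can be sketched like this. The suite name, jar path, and `udfTestInfos` contents below are illustrative, assuming ScalaTest's `Assertions.assume` and `AnyFunSuite`:

```scala
import org.scalatest.Assertions.assume
import org.scalatest.funsuite.AnyFunSuite

// Illustrative sketch, not the real HiveUDFDynamicLoadSuite.
class UdfDynamicLoadSketch extends AnyFunSuite {
  private val udfTestInfos = Seq("udf-a", "udf-b")
  private val jarPath = "hive-test-udfs.jar"

  udfTestInfos.foreach { info =>
    // BAD: this line is evaluated while the class body (<init>) runs,
    // before test() has registered any case. If the jar is missing, the
    // TestCanceledException escapes the constructor and ScalaTest marks
    // the whole suite as aborted, failing the job.
    // assume(new java.io.File(jarPath).exists)

    test(s"load $info") {
      // OK: an assume() inside the test body cancels only this test,
      // which is reported as cancelled/ignored rather than aborting
      // the suite.
      assume(new java.io.File(jarPath).exists)
    }
  }
}
```

This is why simply restoring upstream's constructor-time `assume` was not enough: it had to be commented out again, with `IgnoreCometSuite` mixed in so the five tests show up as ignored under Comet.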
@andygrove andygrove marked this pull request as ready for review April 24, 2026 00:19
Contributor

@comphead comphead left a comment

It's LGTM, @parthchandra FYI

-          errorClass = "_LEGACY_ERROR_TEMP_2055",
-          messageParameters = Map("message" -> s"File $path does not exist")))
+        QueryExecutionErrors
+          .fileNotExistError(path, new FileNotFoundException(s"File $path does not exist")))
Contributor


I think Spark 4.0 will use - FAILED_READ_FILE.FILE_NOT_EXIST instead of the legacy error class. Not a blocker (See https://github.com/apache/spark/blob/c241d5ad4a2372bbddc7dd8339987a09f501dc36/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala#L879)

Member Author


Yes, that is what this code change achieves? Maybe I am not understanding something here though

Contributor


Oh right. The split line threw me off!

@andygrove andygrove merged commit 2cb6142 into apache:main Apr 24, 2026
178 of 179 checks passed

3 participants